TrackPublishHandlerAsActiveAsync closure and synchronous invocation hint #26986

Merged
merged 4 commits into from
Feb 14, 2022

danielmarbach
Contributor

@ghost ghost added Event Hubs customer-reported Issues that are reported by GitHub users external to the Azure organization. labels Feb 13, 2022
@ghost

ghost commented Feb 13, 2022

Thank you for your contribution @danielmarbach! We will review the pull request and get back to you soon.

@ghost ghost added the Community Contribution Community members are working on the issue label Feb 13, 2022
@danielmarbach
Contributor Author

@jsquire is there any reason why the TrackPublishHandlerAsActiveAsync calls invoke the On...Async methods they wrap with Task.Run? They are truly asynchronous methods, and I'm wondering why offloading is required there

Member

@jsquire jsquire left a comment

Thanks, @danielmarbach! Nice catch on that one. Please let me know if you're good with me applying the formatting suggestion and I'll get this merged in.

@jsquire
Member

jsquire commented Feb 14, 2022

> @jsquire is there any reason why the TrackPublishHandlerAsActiveAsync calls invoke the On...Async methods they wrap with Task.Run? They are truly asynchronous methods, and I'm wondering why offloading is required there

The OnXXAsync methods invoke and await user-provided code in the event handler. We can't know if the user code is truly asynchronous or is doing a heavy bit of synchronous work and just returning Task.CompletedTask at the end. The Task.Run is intended to ensure that we don't block on the event handler. We track them as active so that during shutdown we can block on them and ensure that all the handlers have completed before we finish. The idea is that we don't want to have someone call CloseAsync and we return and they terminate their application with handlers pending/running.
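To make the concern concrete, here is a minimal sketch of the pattern described above. It is not the SDK's implementation: `HandlerOffloadSketch`, `FakeAsyncHandler`, `TrackAsActive`, and `DrainAsync` are hypothetical names chosen for illustration. It shows a handler that returns `Task.CompletedTask` after synchronous work, how `Task.Run` keeps the caller from blocking on it, and how tracked tasks can be awaited during shutdown.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration; none of these names come from the Event Hubs SDK.
public static class HandlerOffloadSketch
{
    // Tasks for handlers that are still running, so shutdown can wait on them.
    private static readonly ConcurrentDictionary<Guid, Task> ActiveHandlers =
        new ConcurrentDictionary<Guid, Task>();

    // A handler that looks asynchronous but does all of its work synchronously.
    public static Task FakeAsyncHandler()
    {
        Thread.Sleep(200); // stands in for heavy synchronous work
        return Task.CompletedTask;
    }

    // Offloads the handler to the thread pool and tracks it until it completes.
    public static void TrackAsActive(Func<Task> handler)
    {
        var id = Guid.NewGuid();
        Task task = Task.Run(handler);
        ActiveHandlers[id] = task;
        task.ContinueWith(t => ActiveHandlers.TryRemove(id, out _), TaskScheduler.Default);
    }

    // Waits for every tracked handler, as a CloseAsync-style shutdown would.
    public static Task DrainAsync() => Task.WhenAll(ActiveHandlers.Values);

    public static async Task Main()
    {
        // Direct invocation: the caller is held up while the handler sleeps.
        var watch = Stopwatch.StartNew();
        Task direct = FakeAsyncHandler();
        Console.WriteLine($"Direct call returned after {watch.ElapsedMilliseconds} ms");
        await direct;

        // Offloaded invocation: the caller continues without waiting for the handler.
        watch.Restart();
        TrackAsActive(FakeAsyncHandler);
        Console.WriteLine($"Offloaded call returned after {watch.ElapsedMilliseconds} ms");

        // Shutdown still waits for the handler to finish.
        await DrainAsync();
        Console.WriteLine($"Drained after {watch.ElapsedMilliseconds} ms");
    }
}
```

The cost discussed later in the thread is visible here as well: every invocation goes through `Task.Run` and a continuation, even when the handler is genuinely asynchronous.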

…fferedProducerClient.cs

Co-authored-by: Jesse Squire <jesse.squire@gmail.com>
danielmarbach and others added 2 commits February 14, 2022 17:44
…fferedProducerClient.cs

Co-authored-by: Jesse Squire <jesse.squire@gmail.com>
@danielmarbach
Contributor Author

> > @jsquire is there any reason why the TrackPublishHandlerAsActiveAsync calls invoke the On...Async methods they wrap with Task.Run? They are truly asynchronous methods, and I'm wondering why offloading is required there

> The OnXXAsync methods invoke and await user-provided code in the event handler. We can't know if the user code is truly asynchronous or is doing a heavy bit of synchronous work and just returning Task.CompletedTask at the end. The Task.Run is intended to ensure that we don't block on the event handler. We track them as active so that during shutdown we can block on them and ensure that all the handlers have completed before we finish. The idea is that we don't want to have someone call CloseAsync and we return and they terminate their application with handlers pending/running.

Got it. It's a pity, because now we offload even for truly asynchronous methods. In general, avoiding Task.Run at runtime for asynchronous methods would be hugely beneficial, since we wouldn't put pressure on the thread pool for no added value.

@jsquire
Member

jsquire commented Feb 14, 2022

> Got it. It's a pity, because now we offload even for truly asynchronous methods. In general, avoiding Task.Run at runtime for asynchronous methods would be hugely beneficial, since we wouldn't put pressure on the thread pool for no added value.

Agreed, but I don't have a better approach. If you've got ideas, I'd love to hear them.

@jsquire
Member

jsquire commented Feb 14, 2022

@danielmarbach: Are you good with me merging, or do you still have some things in flight?

@danielmarbach
Contributor Author

I'm good to go. Will put the other topic into the attic of my brain. Maybe something falls out of it with the next dedusting of the attic 😁

@jsquire jsquire merged commit e781c3a into Azure:main Feb 14, 2022
@danielmarbach danielmarbach deleted the eventhub-closure branch February 15, 2022 11:02
@danielmarbach
Contributor Author

> I'm good to go. Will put the other topic into the attic of my brain. Maybe something falls out of it with the next dedusting of the attic 😁

I have some thoughts. If I understood correctly, we are mostly concerned with making sure the publishing logic is never impacted by a delegate that might run synchronously. Yet at most one delegate can be assigned, so there is no need for multiple delegates to run concurrently. We also need to wait for outstanding delegate invocations to complete while closing. Because of that, in the majority of cases we end up doing the task tracking and continuation attachment, plus the additional task-based machinery, including offloading to the worker thread pool for every invocation.

So how about simply storing a readonly WorkStruct containing the batch list, the partition id and an optional exception inside a channel, and having a dedicated background task that takes items from the channel and invokes the corresponding delegates? With that, the task tracking is no longer necessary, because you can simply await the completion of the background channel reader. And because the channel reader is already offloaded, you don't need to pay the price of offloading every delegate invocation.
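A rough sketch of what this proposal could look like, using `System.Threading.Channels`. All type and member names here (`HandlerWork`, `ChannelHandlerPump`, `Enqueue`, `CloseAsync`) are hypothetical, events are represented as plain strings, and the unbounded channel is an assumption made only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical work item: the batch, the partition id and an optional exception.
public readonly struct HandlerWork
{
    public HandlerWork(IReadOnlyList<string> events, string partitionId, Exception exception = null)
    {
        Events = events;
        PartitionId = partitionId;
        Exception = exception;
    }

    public IReadOnlyList<string> Events { get; }
    public string PartitionId { get; }
    public Exception Exception { get; } // null when the send succeeded
}

// Hypothetical pump: one background reader invokes the handlers one after another.
public sealed class ChannelHandlerPump
{
    private readonly Channel<HandlerWork> _channel = Channel.CreateUnbounded<HandlerWork>(
        new UnboundedChannelOptions { SingleReader = true });
    private readonly Task _reader;

    public ChannelHandlerPump(Func<HandlerWork, Task> onSucceeded, Func<HandlerWork, Task> onFailed)
    {
        // The reader is offloaded once, instead of once per handler invocation.
        // If a handler throws, the reader task faults and CloseAsync surfaces it.
        _reader = Task.Run(async () =>
        {
            await foreach (HandlerWork work in _channel.Reader.ReadAllAsync())
            {
                Func<HandlerWork, Task> handler = work.Exception is null ? onSucceeded : onFailed;
                await handler(work);
            }
        });
    }

    // Called from the publishing path; never waits on user code.
    public bool Enqueue(HandlerWork work) => _channel.Writer.TryWrite(work);

    // Completing the writer and awaiting the reader replaces per-task tracking.
    public Task CloseAsync()
    {
        _channel.Writer.Complete();
        return _reader;
    }
}

public static class ChannelPumpDemo
{
    public static async Task Main()
    {
        var pump = new ChannelHandlerPump(
            work =>
            {
                Console.WriteLine($"Sent {work.Events.Count} event(s) to partition {work.PartitionId}");
                return Task.CompletedTask;
            },
            work =>
            {
                Console.WriteLine($"Failed on partition {work.PartitionId}: {work.Exception.Message}");
                return Task.CompletedTask;
            });

        pump.Enqueue(new HandlerWork(new[] { "a", "b" }, "0"));
        pump.Enqueue(new HandlerWork(new[] { "c" }, "1", new InvalidOperationException("boom")));
        await pump.CloseAsync();
    }
}
```

Because a single reader drains the channel, handlers in this sketch run strictly one after another.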

@danielmarbach
Contributor Author

danielmarbach commented Feb 15, 2022

I realized that this would be a behavior-breaking change: the attached event handler delegate would no longer be invoked concurrently, but one invocation after another. So that proposal doesn't work unless you are willing to break that behavior or introduce a flag on the options.

@jsquire
Member

jsquire commented Feb 15, 2022

> there is no need for multiple delegates to run concurrently.

That's not quite a correct statement. Partitions are independent and the delegate runs concurrently with a default of "limit to one invocation per partition" - but this is configurable. It is possible (and likely for high-throughput scenarios) to allow concurrent execution of partitions. We cap everything to the maximum concurrency option, but it's fair to say in most systems that you'll have several callbacks being executed concurrently.

> So how about simply storing a readonly WorkStruct containing the batch list, the partition id and an optional exception inside a channel, and having a dedicated background task that takes items from the channel and invokes the corresponding delegates?

It's an interesting thought, but I'm concerned that will create a throttle point. For scenarios where we have a lot of small events being sent frequently, especially when concurrency is dialed up, it's a very real possibility that we'd end up queuing a chain of callbacks that we can't keep up with. Either the channel would have to grow unbounded, or we'd start blocking when things get full.
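The "blocking when things get full" half of that trade-off can be illustrated with a bounded channel from `System.Threading.Channels`. This is only a sketch: the capacity of 2, the `int` payload, and the `BoundedChannelSketch` name are arbitrary choices for the example.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical illustration of a full bounded channel pushing back on the writer.
public static class BoundedChannelSketch
{
    public static Channel<int> CreateChannel() =>
        Channel.CreateBounded<int>(new BoundedChannelOptions(2)
        {
            FullMode = BoundedChannelFullMode.Wait
        });

    // Attempts three non-blocking writes into a channel with room for two items.
    public static bool[] WriteThree(Channel<int> channel) => new[]
    {
        channel.Writer.TryWrite(1),
        channel.Writer.TryWrite(2),
        channel.Writer.TryWrite(3) // the channel is full at this point
    };

    public static async Task Main()
    {
        Channel<int> channel = CreateChannel();
        Console.WriteLine(string.Join(", ", WriteThree(channel)));

        // An asynchronous write stays pending until a reader makes room.
        ValueTask pending = channel.Writer.WriteAsync(3);
        Console.WriteLine($"Pending write completed: {pending.IsCompleted}");

        channel.Reader.TryRead(out int first);
        await pending;
        Console.WriteLine($"Read {first}; the pending write has now gone through");
    }
}
```

With `Channel.CreateUnbounded` the writes would always succeed, at the price of a queue that can grow without limit if the reader cannot keep up.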

If we start seeing scheduling issues due to the backgrounding of handlers, our best bet is probably to consider awaiting the invocation at the call site. That would keep us with a known degree of concurrency and shift responsibility to developers - if you want your publishing to run quickly, do less in your handlers. We considered that, but instead kept the default degree of concurrency constrained to avoid pressure on the thread pool.

Maybe, ultimately, that's the way to go. We haven't seen anything crazy in testing, but we're also still pending stress runs.

@danielmarbach
Contributor Author

The other option would be to not wrap the event invocation with Task.Run and still fire-and-forget the execution. Then all truly async delegates are executed concurrently without creating thread pool pressure, and guidance could show how to offload synchronous work in the handler if desired. But I understand this might cross a usability threshold that makes the library hard to grasp for certain users, which is probably not something you want to go for.
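A sketch of what such guidance might look like, under the assumption that the library invokes the handler directly and does not await it. The names (`HandlerSideOffloadSketch`, `BlockingHandler`, `OffloadingHandler`, `InvokeWithoutTaskRun`) are hypothetical; the point is that a handler with heavy synchronous work would wrap that work in `Task.Run` itself.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration; these names are not part of the Event Hubs SDK.
public static class HandlerSideOffloadSketch
{
    // Stand-in for heavy synchronous work inside a handler.
    private static void CrunchNumbers() => Thread.Sleep(200);

    // Does the heavy work inline, so whoever invokes it is blocked.
    public static Task BlockingHandler()
    {
        CrunchNumbers();
        return Task.CompletedTask;
    }

    // Offloads its own synchronous work, so the invoker gets a Task back promptly.
    public static Task OffloadingHandler() => Task.Run(() => CrunchNumbers());

    // Invokes the handler the way a library without Task.Run would, and
    // reports how long the invoking thread was held up.
    public static long InvokeWithoutTaskRun(Func<Task> handler, out Task pending)
    {
        var watch = Stopwatch.StartNew();
        pending = handler(); // not awaited here: fire-and-forget from the caller's view
        return watch.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        long blocked = InvokeWithoutTaskRun(BlockingHandler, out Task first);
        long offloaded = InvokeWithoutTaskRun(OffloadingHandler, out Task second);

        Console.WriteLine($"Inline handler held the caller for {blocked} ms");
        Console.WriteLine($"Self-offloading handler held the caller for {offloaded} ms");

        await Task.WhenAll(first, second);
    }
}
```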

@jsquire
Member

jsquire commented Feb 15, 2022

> The other option would be to not wrap the event invocation with Task.Run and still fire-and-forget the execution. Then all truly async delegates are executed concurrently without creating thread pool pressure, and guidance could show how to offload synchronous work in the handler if desired. But I understand this might cross a usability threshold that makes the library hard to grasp for certain users, which is probably not something you want to go for.

It's worth considering, for sure, but you nailed the primary concern. From observations during user studies, I don't think the "sync until you await something" behavior is well understood. There seems to be a common belief that anytime you return Task, you are executing asynchronously.
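The "sync until you await something" behavior can be seen in a few lines. In this sketch (names are illustrative only), the part of an `async` method before its first incomplete `await` runs on the caller's thread before the caller gets its `Task` back.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical illustration of when the code in an async method actually runs.
public static class SyncUntilAwaitSketch
{
    public static readonly List<string> Log = new List<string>();

    public static async Task HandlerAsync()
    {
        // Runs synchronously, as part of the caller's call to HandlerAsync().
        Log.Add("handler: before first await");

        await Task.Delay(50);

        // Runs later, as a continuation, once the delay has completed.
        Log.Add("handler: after first await");
    }

    public static async Task Main()
    {
        Task pending = HandlerAsync();
        Log.Add("caller: handler returned a Task");
        await pending;

        // Expected order: before first await, caller line, after first await.
        Console.WriteLine(string.Join(Environment.NewLine, Log));
    }
}
```

If the work before the first `await` were a long synchronous computation, the caller would be blocked for all of it, even though the method returns `Task`.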

@jsquire
Member

jsquire commented Feb 17, 2022

@danielmarbach: Thank you for the discussion here. I've been kicking this around the last couple of days and it came up in a similar conversation. I think that we originally underestimated the value of having determinism for the timing of when the success/fail handler is invoked after the send and ensuring the order they'll fire with a single degree of concurrency.

I'm going to change the implementation to invoke the handler and await at the completion of the Send. We'll try adding guidance for "if you want to maximize throughput, don't do a lot of direct work in your handler."
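As a rough illustration of that direction (not the change made in the SDK), the sketch below awaits the handler right after each send completes. `AwaitAtCallSiteSketch`, `SendAsync`, and `PublishAsync` are hypothetical names, and the send is simulated with a short delay.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical illustration; the real publishing loop is considerably more involved.
public static class AwaitAtCallSiteSketch
{
    // Stand-in for sending a batch to the service.
    private static Task SendAsync(int batch) => Task.Delay(10);

    // Sends batches one at a time and awaits the handler after each send.
    public static async Task<List<string>> PublishAsync(int batches, Func<int, Task> onSucceeded)
    {
        var log = new List<string>();

        for (int batch = 1; batch <= batches; batch++)
        {
            await SendAsync(batch);
            log.Add($"sent {batch}");

            // No Task.Run and no tracking: the next send waits for the handler.
            await onSucceeded(batch);
            log.Add($"handled {batch}");
        }

        return log;
    }

    public static async Task Main()
    {
        List<string> log = await PublishAsync(2, batch => Task.Delay(20));
        Console.WriteLine(string.Join(", ", log));
    }
}
```

With a single degree of concurrency the handler for one batch always finishes before the next send begins, which gives the deterministic timing and ordering mentioned above. The trade-off is that slow handlers now slow down publishing directly.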

Update

Changed in #27173.

@jsquire jsquire added this to the [2022] March milestone Mar 7, 2022